Approximate Byzantine Fault-Tolerance in Distributed Optimization

ABSTRACT

This paper considers the problem of Byzantine fault-tolerance in distributed multi-agent optimization. In this problem, each agent has a local cost function, and in the fault-free case the goal is to design a distributed algorithm that allows all the agents to find a minimum point of all the agents' aggregate cost function. We consider a scenario where some agents may be Byzantine faulty, which renders the original goal of computing a minimum point of all the agents' aggregate cost vacuous. A more reasonable objective in this scenario is for all the non-faulty agents to compute a minimum point of only the non-faulty agents' aggregate cost. Prior work shows that if up to f (out of n) agents are Byzantine faulty, then a minimum point of the non-faulty agents' aggregate cost can be computed exactly if and only if the non-faulty agents' costs satisfy a certain redundancy property called 2f-redundancy. However, 2f-redundancy is an ideal property that holds only in systems free from noise and uncertainties, which can make the goal of exact fault-tolerance unachievable in some applications. Thus, we introduce the notion of (f, ε)-resilience, a generalization of exact fault-tolerance wherein the objective is to find an approximate minimum point of the non-faulty agents' aggregate cost, with ε accuracy. This approximate fault-tolerance can be achieved under a condition weaker than 2f-redundancy and hence easier to satisfy in practice. We obtain necessary and sufficient conditions for achieving (f, ε)-resilience, characterizing the trade-off between relaxation in redundancy and approximation in resilience. For the case when the agents' cost functions are differentiable, we obtain conditions for the (f, ε)-resilience of the distributed gradient-descent method equipped with robust gradient aggregation, such as comparative gradient elimination or coordinate-wise trimmed mean.
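
As a rough illustration of the two robust aggregation rules named above, the NumPy sketch below shows how a server could combine n reported gradient vectors when up to f of them may come from Byzantine agents. This is a minimal sketch, not the paper's implementation: the function names cge_aggregate and cwtm_aggregate, the step size, and the toy data are all our own.

import numpy as np

def cge_aggregate(grads, f):
    # Comparative gradient elimination (CGE): sort the n reported
    # gradients by Euclidean norm, drop the f with the largest norms,
    # and average the remaining n - f.
    norms = np.linalg.norm(grads, axis=1)
    keep = np.argsort(norms)[: len(grads) - f]
    return grads[keep].mean(axis=0)

def cwtm_aggregate(grads, f):
    # Coordinate-wise trimmed mean (CWTM): in each coordinate, discard
    # the f smallest and f largest values, then average the remaining
    # n - 2f (requires n > 2f).
    sorted_coords = np.sort(grads, axis=0)
    return sorted_coords[f : len(grads) - f].mean(axis=0)

# Toy server-side gradient-descent step with one faulty gradient.
rng = np.random.default_rng(0)
grads = rng.normal(size=(10, 3))       # n = 10 reported gradients in R^3
grads[0] *= 100.0                      # a faulty agent reports a huge gradient
x = np.zeros(3)
x -= 0.1 * cge_aggregate(grads, f=1)   # the outlier is dropped before averaging

Intuitively, both rules cap a faulty agent's influence on a single step: every gradient that survives CGE has norm no larger than that of at least one non-faulty agent's gradient, and every value kept by CWTM lies between values reported by non-faulty agents.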

Supplemental Material

podc-21-full~1.mp4 (mp4, 64.2 MB)

Published in

PODC '21: Proceedings of the 2021 ACM Symposium on Principles of Distributed Computing
July 2021, 590 pages
ISBN: 9781450385480
Proceedings DOI: 10.1145/3465084; article DOI: 10.1145/3465084.3467902
Copyright © 2021 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Published: 23 July 2021
